(Generalized) Linear Regression on Microaggregated Data - From Nuisance Parameter Optimization to Partial Identification
Abstract
Protecting sensitive microdata before publishing them or passing them on is crucial: a trade-off between sufficient disclosure control and analyzability needs to be found. This paper presents a starting point for evaluating the effect of using k-anonymity microaggregated data in (generalized) linear regression. Taking a rigorous imprecision perspective, microaggregated data are understood as inducing a set X of potentially true data. Based on this representation, two conceptually different approaches to deriving estimates from the ideal likelihood are discussed. The first one picks a single element of X, for instance by naively treating the microaggregated data as the true ones or by introducing a maximax approach that takes the elements of X as nuisance parameters to be optimized. The second one seeks, in the spirit of Partial Identification, the set of all maximum likelihood estimators compatible with the elements of X, thus creating cautious estimators. As the simulation study corroborates, the sets of estimators obtained with the latter approach are still precise enough to be practically relevant.
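The three estimators named in the abstract can be illustrated with a small Python sketch for simple linear regression with one microaggregated covariate. This is not the authors' procedure: the function names, the fixed-size grouping rule, the bound `radius` on within-group deviations, and the random search are all assumptions made for illustration. The sketch treats X as the set of covariate vectors whose group means match the published ones, and it relies on the fact that, in a Gaussian linear model, maximizing the likelihood over a candidate is equivalent to minimizing its residual sum of squares. Random draws only give an inner approximation of the set of estimates.

```python
import numpy as np

def microaggregate(x, k):
    """Sort x, form groups of k neighbours, publish each group's mean."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    groups = np.empty(len(x), dtype=int)
    groups[order] = np.arange(len(x)) // k
    last = groups.max()
    if len(x) % k and last > 0:
        # fold an undersized last group into the previous one
        groups[groups == last] = last - 1
    means = np.bincount(groups, weights=x) / np.bincount(groups)
    return means[groups], groups

def ols_fit(x, y):
    """Slope and residual sum of squares of y ~ intercept + x."""
    xc = x - x.mean()
    slope = xc @ (y - y.mean()) / (xc @ xc)
    resid = (y - y.mean()) - slope * xc
    return slope, resid @ resid

def candidates(x_agg, groups, radius, n_draws, rng):
    """Yield elements of X: same group means as x_agg (assumed bound on deviations)."""
    counts = np.bincount(groups)
    yield x_agg  # the naive choice: published values taken as true
    for _ in range(n_draws):
        noise = rng.uniform(-radius, radius, size=len(x_agg))
        # centre the noise within each group so group means are unchanged
        noise -= (np.bincount(groups, weights=noise) / counts)[groups]
        yield x_agg + noise

def estimate(x_agg, y, groups, radius=0.5, n_draws=2000, seed=0):
    """Naive slope, maximax slope, and sampled range of compatible slopes."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    fits = [ols_fit(c, y) for c in candidates(x_agg, groups, radius, n_draws, rng)]
    slopes = np.array([s for s, _ in fits])
    rss = np.array([r for _, r in fits])
    return {
        "naive": slopes[0],
        "maximax": slopes[rss.argmin()],  # smallest RSS = largest Gaussian likelihood
        "set": (slopes.min(), slopes.max()),
    }

x_agg, groups = microaggregate([1, 2, 3, 4, 5, 6], k=3)
result = estimate(x_agg, [1.0, 2.1, 2.9, 4.2, 5.1, 5.8], groups)
```

The sketch needs at least two groups, since a constant covariate leaves the slope undefined. A wider `radius` enlarges the sampled set, which mirrors the trade-off between cautious and precise estimates described above.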
Similar resources
The new robust conic GPLM method with an application to finance: prediction of credit default
This paper contributes to classification and identification in modern finance through advanced optimization. In the last few decades, financial misalignments and, thereby, financial crises have been increasing in numbers due to the rearrangement of the financial world. In this study, as one of the most remarkable of these, countries’ debt crises, which result from illiquidity, are tried to pred...
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on (0, 1) interval that covers symm...
Differenced-Based Double Shrinking in Partial Linear Models
The partial linear model is very flexible when the relation between the covariates and responses is either parametric or nonparametric. However, estimation of the regression coefficients is challenging since one must also estimate the nonparametric component simultaneously. As a remedy, the differencing approach, to eliminate the nonparametric component and estimate the regression coefficients, can ...
Using a combination of genetic algorithm and particle swarm optimization algorithm for GEMTIP modeling of spectral-induced polarization data
The generalized effective-medium theory of induced polarization (GEMTIP) is a newly developed relaxation model that incorporates the petro-physical and structural characteristics of polarizable rocks in the grain/porous scale to model their complex resistivity/conductivity spectra. The inversion of the GEMTIP relaxation model parameter from spectral-induced polarization data is a challenging is...
Non-linear Fractional-Order Chaotic Systems Identification with Approximated Fractional-Order Derivative based on a Hybrid Particle Swarm Optimization-Genetic Algorithm Method
Although many mathematicians have studied fractional calculus for many years, its application in engineering, especially in modeling and control, does not have many antecedents. Since there is much freedom in choosing the order of the differentiator and integrator in fractional calculus, it is possible to model physical systems accurately. This paper deals with time-domain id...
Publication date: 2017